DESCRIPTION

IMPROVED FINDING OF TV ANYTIME WEB SERVICES 

This invention relates to finding TV Anytime web services using a server 
based file with a well-known name, location and structure. This invention also 
relates to a method for aggregating and categorising TV Anytime web 
services. 

The TV Anytime Forum is in the process of standardising a set of web 
services which allow TV Anytime clients (e.g. PDRs - Personal Digital 
Recorders) to retrieve TV Anytime data (e.g. program schedules, descriptions, 
etc.) from TV Anytime IP (Internet Protocol) servers. Different types of TV 
Anytime web services can be offered from a given web site and can have 
different, unrelated URLs (Uniform Resource Locators). The object of this 
invention is to allow a PDR to automatically find out whether an arbitrary web 
site offers TV Anytime services, and if so which types of services it offers. 

1. State of the art 

TV Anytime (http://www.tv-anytime.org) has not specified mechanisms
for discovering TV Anytime web services. The following work is relevant: 

1.1 Use of DNS for finding a TV Anytime service for a particular program 
identifier 

This mechanism is described in the TV Anytime Content Referencing
specification (ftp://tva@ftp.bbc.co.uk/pub/Specifications/SP004v11.zip,
password "tva"). Given a CRID (Content Reference Identifier), DNS (Domain
Name Service) is used to request the machine name and port of a server 
which is able to provide a TV Anytime service that offers particular information 
about that CRID. However, once this service has been found it offers no 
information on the presence or otherwise of other TV Anytime services on the 
same server. Moreover, not all TV Anytime service types can be found using
this deterministic mechanism. For example, if the PDR wishes to find a server 
that allows the user to search for programmes, then DNS is not helpful. 

1.2 Use of UDDI (Universal Description, Discovery and Integration) 

UDDI (http://www.uddi.org) represents one technology for facilitating the
discovery of web services, based on repositories
that provide a type of web service "Yellow Pages". By querying the repository
a device is able to find web services which match a certain technical 
description and perhaps match some other taxonomy classification. The 
approach provides a solution to the problem, "How do I find a list of services 
that provide a certain service type and are TV Anytime compliant?". 

1.3 Use of web robots / spiders to index a web site

For traditional static web content (i.e. HTML pages) a web robot can be 
used to find and index the content of a site. The information gained is stored 
and used for tools such as search engines. However, this is not well suited for 
direct use by a PDR (it is a slow process, involving multiple network 
transactions), nor is it particularly useful when the content is dynamically 
generated by a web service. Although a method could be conceived whereby 
a TV Anytime search engine blindly tries to discover services by testing their 
behaviour, this would be prohibitively slow, error prone and not guaranteed to 
find all the TV Anytime services provided by that site. 

1.4 Use of a robots.txt file (http://www.robotstxt.org/wc/robots.html)

By placing a robots.txt file in a well-known place on a server (e.g.
http://foo.com/robots.txt) a server is able to specify a set of rules for the whole
web site, which compliant web robots will obey. Whilst not directly relevant to
TV Anytime, this is an example of placing a file (with a well-known
name, structure and location) on a web server to provide information about the
web site that can be used both automatically and manually.

2. The problem 

This invention provides a solution to the problem, "How do I know if this 
web-site offers any TV Anytime services, and if it does where are they?" A 
solution is needed for two reasons. Firstly, a PDR may be aware of a particular 
web site (i.e. machine name and port number), as a result of any number of
processes (see section 3). It would be useful if the PDR can automatically find 
whether TV Anytime web services are available. Having established this, the 
PDR should be able to deduce the types of services offered and where they 
are offered. Secondly, there is likely to be a market for third party sites that 
categorise and index the available TV Anytime services (the TV Anytime
equivalent of a web directory or search engine). With a standardised
description mechanism, a web tool is able to automatically discover and
categorise TV Anytime services without the need for human intervention.

Once the PDR has established the existence of TV Anytime services it 
needs to find out the following information about each of those services: 

• the location where that service is being offered, 

• the type of TV Anytime service being offered, 

• the technical compliance of that service, 

• and the version number of that TV Anytime service. 
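
The four items above can be pictured as one record per discovered service. The following is a minimal sketch in Python; the class name `TvaServiceInfo`, its field names and the example values are illustrative assumptions and are not taken from any TV Anytime specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TvaServiceInfo:
    """One discovered TV Anytime web service (hypothetical record layout)."""

    location: str      # URL where the service is offered
    service_type: str  # type of TV Anytime service, e.g. "get_Metadata"
    compliant: bool    # whether the service declares TV Anytime compliance
    version: str       # version of the TV Anytime service


# Example values are made up for illustration only.
example = TvaServiceInfo(
    location="http://example.com/movies",
    service_type="get_Metadata",
    compliant=True,
    version="2001/11",
)
```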



3. What is proposed

The mechanism proposed is to place a file on the server, which has a
standardised structure containing the necessary information. This file has a
well-known name and is placed at the entry point to the website, thus allowing
a PDR to retrieve the file automatically. The invention specifically includes the
use of the WS-Inspection standard to define the file structure and name of the
file (inspection.wsil).

The invention assumes that the PDR already has knowledge of a
particular web site. The domain name could have been obtained by the 
following mechanisms: 

1. The user has heard of a TV Anytime service through some other 
medium (e.g. recommendation or advertising) and manually enters the domain 
name into their PDR. 

2. The PDR might support a web browser to allow the user to web surf.
It would be relatively inexpensive for a PDR to attempt to download the TV
Anytime file (if any) of the web sites visited by the user.

3. The DNS mechanism discussed in section 1.1.

4. A PDR might receive CRIDs from a number of different sources (e.g. 
embedded in the video stream, as a result of searches, as a result of a 
program recommendation, or as a result of a remotely generated request to 
record a program). The authority name can be extracted from CRIDs and used 
as the domain name in an attempt to find a TV Anytime server file. 
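
Mechanism 4 can be sketched as follows, assuming CRIDs take the form `crid://<authority>/<data>`. The function name `crid_authority` is hypothetical; the sketch only extracts the authority name and leaves the subsequent file retrieval to the caller.

```python
from urllib.parse import urlsplit


def crid_authority(crid: str) -> str:
    """Return the authority part of a CRID such as crid://example.com/abc."""
    parts = urlsplit(crid)
    # urlsplit lower-cases the scheme and exposes the host part as .hostname.
    if parts.scheme != "crid" or not parts.hostname:
        raise ValueError(f"not a CRID with an authority: {crid!r}")
    return parts.hostname
```

The returned name can then be used as the machine name when requesting the well-known file.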

In addition, a business model is proposed, whereby third parties can 
offer search and categorisation services specifically for TV Anytime web 
servers. This can be viewed as analogous to the search and directory engines 
(such as Google, Yahoo, etc.) used to discover HTML based web sites. To
create such a website, a method by which the third party can automatically
aggregate this information is described. A specific use of the WS-Inspection
specification is proposed that allows third parties to spider between TV
Anytime web servers in an efficient fashion.

According to a first aspect of the present invention, there is provided a 
method for finding TV Anytime web services comprising querying a known 
address, obtaining a file from said address, said file having a predefined 
structure, and parsing said file to obtain URLs for TV Anytime web services.

According to a second aspect of the present invention, there is provided 
apparatus for finding TV Anytime web services comprising communicating 
means for querying via a network a known address and for obtaining a file 
from said address, said file having a predefined structure, and processing 
means for parsing said file to obtain URLs for TV Anytime web services. 

According to a third aspect of the present invention, there is provided a
method for supplying a file via a network comprising receiving a query at a 
known address, and supplying a file in response to said query, said file having 
a predetermined structure. 

According to a fourth aspect of the present invention, there is provided a 
server system for supplying a file via a network comprising receiving means for 
receiving a query at a known address, and supplying means for supplying a 
file in response to said query, said file having a predetermined structure. 

According to a fifth aspect of the present invention, there is provided a 
method of spidering websites comprising recursively addressing a URL for a 
non-HTML web service description file, parsing said file to obtain further URLs 
for non-HTML web service description files, and recording said further URLs. 

According to a sixth aspect of the present invention, there is provided a 
server system for supplying URLs for TV Anytime web services via a network 
comprising receiving means for receiving a query, supplying means for 
supplying one or more URLs for TV Anytime web services in response to said 
query, and storing means for storing a categorised list of TV Anytime web 
services. 

If a web site does offer TV Anytime services it places a file with a well 
known name at the entry point to that web site. To obtain the file the PDR 
makes an HTTP GET request to the following URL:

http://<machine name>:<port number>/<well known file name>

The port number is optional
and typically would not be included. The exception is case 3 above, where the
DNS mechanism will explicitly return a port number as well as a machine 
name. A machine-readable document (this could be XML but does not have to 
be) is returned which indicates the presence of TV Anytime services by 
containing references (URLs) to one or more service description files. This 
invention does not mandate the type of service description file that should be 
used, but specifically includes the use of WSDL (Web Services Description 
Language) and UDDI to provide the four pieces of information listed in section 
2. Each service description file may, in turn, provide information on more than 
one TV Anytime service depending on how the web site chooses to group their 
web services. The document may also give the URLs of other related TV
Anytime server files to facilitate the discovery and linking together of new 
services. The mechanism has the following advantages: 

• Lightweight and easy for a web site to implement. 

• Allows a new TV Anytime web server to describe itself without having 
to register with a third party. 

• Facilitates discovery and indexing mechanisms for use by a web robot 
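
The retrieval step described above can be sketched in Python as follows. This is an illustration under stated assumptions: the function names are hypothetical, the file name `inspection.wsil` is the WS-Inspection name mentioned earlier, and `http_get` stands for any caller-supplied function that performs an HTTP GET and returns a `(status, body)` pair. Any response other than 200, including the 404 sent by a server with no web services, is treated as "no file".

```python
from typing import Callable, Optional, Tuple

WELL_KNOWN_FILE = "inspection.wsil"  # assumed well-known file name


def well_known_url(host: str, port: Optional[int] = None,
                   filename: str = WELL_KNOWN_FILE) -> str:
    """Build http://<machine name>[:<port number>]/<well known file name>."""
    netloc = host if port is None else f"{host}:{port}"
    return f"http://{netloc}/{filename}"


def fetch_service_file(host: str,
                       http_get: Callable[[str], Tuple[int, bytes]],
                       port: Optional[int] = None) -> Optional[bytes]:
    """Return the well-known file body, or None if the site does not offer one."""
    status, body = http_get(well_known_url(host, port))
    return body if status == 200 else None
```

Passing the transport in as a parameter keeps the sketch independent of any particular HTTP library.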



Although this provides a means by which a web site can identify 
whether it has TV Anytime services (and if so where they are), this is only 
useful if the client has prior knowledge of the existence of that web site. In 
order to find specific TV Anytime services, the only means available to a client 
device is to conduct an exhaustive search (spidering) of all web sites and to 
use the mechanism described above to test each one for the existence of TV 
Anytime services. Such a process is computationally expensive and certainly 
not feasible for the types of clients envisaged (digital TV receivers, PDAs, 
etc.). 

Therefore it is necessary to alter the searching process to relieve the 
computational burden placed on the client. This can be achieved by the use of 
a third party web site containing categorised web services. Since the vast 
majority of web sites will not offer TV Anytime web services, the searching 
process is altered to enable spidering of the web in a way that efficiently 
discovers TV Anytime web servers. 

It is proposed that a third party is responsible for conducting the 
spidering process. There are no restrictions on who this third party might be. 
Some examples are: a broadcaster wishing to offer a value-adding service for 
TV Anytime clients; a CE manufacturer wishing to improve the functionality of 
the equipment they manufacture; and a specialist interest web site wishing to 
provide TV Anytime information to its users. Since a powerful computer can do 
the spidering the computational expense is less problematic. The third party 
maintains a directory of all the TV Anytime web services it has found. This
directory might offer an HTML interface to allow users to find and browse the 
discovered TV Anytime services. The directory can add value by categorising 
and grouping the services in certain ways that help the user find the services 
they want. 
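
One simple way a directory might categorise what it has found is to group service URLs by service type. The sketch below is only an illustration; the function name `categorise` and the representation of a discovered service as a `(service_type, url)` pair are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def categorise(services: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (service_type, url) pairs into {service_type: sorted unique URLs}."""
    directory = defaultdict(set)
    for service_type, url in services:
        directory[service_type].add(url)  # a set drops duplicate discoveries
    return {kind: sorted(urls) for kind, urls in directory.items()}
```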

In order for the consuming client (i.e. TV Anytime device, such as a 
digital TV receiver) to be able to automatically retrieve the information from the 
machine hosting the third party directory, a standard means of describing the 
list of discovered services is necessary. Such a description could be agreed by 
some standards body (such as the TV Anytime Forum). Alternatively, if the 
directory service is hosted by a CE manufacturer, they may choose to 
implement a private description format since they control both the client 
implementation (i.e. the CE device) and the directory server. 

Another way this invention could be exploited would be for the directory
service to aggregate all the data available from the services that have been
discovered and to offer it through a single integrated TV Anytime web service.

The efficient spidering of TV Anytime services is based upon the 
mechanism described above of using a structured file (in a well-known 
location) to describe the TV Anytime services available from that server. Here, 
it is additionally proposed that this structured file is allowed to contain URLs 
(i.e. hyperlinks) to the description files on other TV Anytime web servers. In 
this way, a "web service spider" can be used to recursively find and download 
the structured file for many TV Anytime web sites. 

By spidering across standardised service location files, rather than 
HTML files, the search space is vastly reduced and the process made more 
efficient. The structured file is split into two sections - links and descriptions- 
both of which are optional. A structured file that contains only links can be 
used to represent a list of TV Anytime web services. This format can itself be 
used by the directory service as a means of describing all the services it has 
found. 
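
The spidering of structured files can be sketched as a traversal of the link graph. The text describes the process as recursive; the sketch below uses an equivalent iterative breadth-first traversal with a queue. The function name `spider`, the `fetch` callback (which returns the descriptions and links of one structured file, or None if the file is unavailable) and the `max_files` limit are all assumptions made for illustration.

```python
from collections import deque
from typing import Callable, Dict, List, Optional, Tuple

# One parsed structured file: (service descriptions, links to other files).
ParsedFile = Tuple[List[str], List[str]]


def spider(start_url: str,
           fetch: Callable[[str], Optional[ParsedFile]],
           max_files: int = 1000) -> Dict[str, List[str]]:
    """Follow links between structured files, returning {file URL: descriptions}."""
    seen = {start_url}
    queue = deque([start_url])
    found: Dict[str, List[str]] = {}
    while queue and len(found) < max_files:
        url = queue.popleft()
        parsed = fetch(url)
        if parsed is None:  # file missing or unreadable: skip this node
            continue
        descriptions, links = parsed
        found[url] = descriptions
        for link in links:
            if link not in seen:  # guard against cycles in the link graph
                seen.add(link)
                queue.append(link)
    return found
```

Because only structured files are fetched, each node costs one request and one parse, which is the source of the efficiency gain described above.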

Some additional restrictions regarding the way the description part of
the structured file is formatted can be used to facilitate the process.
Specifically, when describing a TV Anytime web service, the structured file
should include the following information in its descriptions of the web services
available at that site: an indication that the service is a TV Anytime service; the
protocol version of the TV Anytime service; the types of TV Anytime services
offered. This information must be present in the structured file itself and not by
means of reference (e.g. a reference to a detailed description of that service).
In this way, there is no need to download and parse other files in order to
establish the existence of a TV Anytime service. Consequently, the amount of
processing required at each node of the search space is also reduced, once
again enabling more effective spidering of TV Anytime web services.



4. Fields of application of the invention 



The invention applies to TV Anytime IP clients and servers. 

Clients. Any device that wishes to receive information related to TV 
programme schedules could use this invention. Typically this will be a 
Personal Digital Recorder or some other TV device (Integrated Digital TV, 
set-top-box, etc.) that wishes to display TV schedules to a user. However, any 
other network-enabled devices could also exploit the invention for the same 
purpose. These include Personal Computers, mobile phones, PDAs, etc. 

Servers. Any web server with the appropriate information can host a TV 
Anytime service. Most often this will be a broadcaster's web server, but also 
includes third party web sites providing specialised and enhanced metadata 
about TV programmes. 



5. An example of the invention 

The Web Services Inspection Language provides one standard method 
of specifying how to inspect a web site for available Web services. The 
WS-Inspection specification defines the locations on a Web site where you
could look for Web service descriptions. The following URLs give an overview
and the specification of WS-Inspection:

http://www-106.ibm.com/developerworks/webservices/library/ws-wsilover/

http://www-106.ibm.com/developerworks/webservices/library/ws-wsilspec.html

Figures 1 and 2 show a first embodiment of the invention of placing a
file at the entry point of a website, Figure 1 illustrating the format of a possible
WS-Inspection file and Figure 2 illustrating the format of the corresponding
service description file. Figure 3 shows the steps involved in finding new TV
Anytime services, using the files of Figures 1 and 2.

Figure 4 shows a second embodiment with an improved WS-Inspection
file. This file structure has two advantages over the WS-Inspection file of
Figure 1. Firstly a client device can establish directly from the file the existence
of TV Anytime compliant web services without the need for further network
transactions. Secondly the links to other TV Anytime WS-Inspection files
enable spidering of TV Anytime web services.




Figure 1. Format of a possible WS-Inspection file.

http://example.com/inspection.wsil

<?xml version="1.0" encoding="UTF-8"?>
<inspection xmlns="http://schemas.xmlsoap.org/ws/2001/10/inspection/"
    xmlns:wsilwsdl="http://schemas.xmlsoap.org/ws/2001/10/inspection/wsdl/">
  <service>
    <description referencedNamespace="http://schemas.xmlsoap.org/wsdl/"
        location="http://example.com/tva_services.wsdl">
      <wsilwsdl:reference endpoint="true">
        <wsilwsdl:referencedService xmlns:ns="http://example.com/tva">
          ns:TvaCookingService</wsilwsdl:referencedService>
      </wsilwsdl:reference>
    </description>
  </service>
  <service>
    <description referencedNamespace="http://schemas.xmlsoap.org/wsdl/"
        location="http://example.com/tva_services.wsdl">
      <wsilwsdl:reference endpoint="true">
        <wsilwsdl:referencedService xmlns:ns="http://example.com/tva">
          ns:TvaMovieService</wsilwsdl:referencedService>
      </wsilwsdl:reference>
    </description>
  </service>
  <!-- References to other groups of TV Anytime services could be inserted here -->
</inspection>

Figure 2. Format of the corresponding service description file.

<?xml version="1.0" encoding="UTF-8"?>
<definitions targetNamespace="http://example.com/tva"
    xmlns:tva="http://www.tv-anytime.org/2001/11/transport/wsdl"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns="http://schemas.xmlsoap.org/wsdl/">
  <import namespace="http://www.tv-anytime.org/2001/11/transport/wsdl"/>
  <service name="TvaCookingService">
    <port name="get_Metadata_Cooking" binding="tva:get_Resolution_Port">
      <soap:address location="http://example.com/cooking"/>
    </port>
    <port name="searchOn_Delivery_Cooking"
        binding="tva:searchOn_Delivery_Port">
      <soap:address location="http://example.com/cooking"/>
    </port>
  </service>
  <service name="TvaMovieService">
    <port name="get_Metadata_Movies" binding="tva:get_Metadata_Port">
      <soap:address location="http://example.com/movies"/>
    </port>
    <port name="searchOn_Description_Movies"
        binding="tva:searchOn_Description_Port">
      <soap:address location="http://example.com/movies"/>
    </port>
  </service>
</definitions>



Annotations to the service description file:

• The namespace indicates compliance with TVA, along with the version.

• The port name gives the type of TVA services supported.

• The entry points (URL) to the different constituent services.
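
As an illustration of how a client might read such a service description file, the following Python sketch lists the TV Anytime ports found in a WSDL document. It is a sketch under stated assumptions: the function name `tva_ports` is hypothetical, the sample document is a cut-down version of the example file, and a port is taken to be a TV Anytime port when the prefix of its binding resolves to the TV Anytime namespace, which is only one possible reading of the annotations.

```python
import io
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"
SOAP_NS = "http://schemas.xmlsoap.org/wsdl/soap/"
TVA_NS = "http://www.tv-anytime.org/2001/11/transport/wsdl"


def tva_ports(wsdl_text: str):
    """Return (service name, binding local name, endpoint URL) per TVA port."""
    prefixes, root = {}, None
    source = io.BytesIO(wsdl_text.encode("utf-8"))
    # Collect prefix declarations while parsing. Simplification: one
    # document-wide prefix map, so scoped redeclarations are not modelled.
    for event, payload in ET.iterparse(source, events=("start-ns", "start")):
        if event == "start-ns":
            prefix, uri = payload
            prefixes[prefix] = uri
        elif root is None:
            root = payload  # first element started is the document root
    results = []
    for service in root.iter(f"{{{WSDL_NS}}}service"):
        for port in service.findall(f"{{{WSDL_NS}}}port"):
            prefix, _, local = port.get("binding", "").partition(":")
            address = port.find(f"{{{SOAP_NS}}}address")
            if prefixes.get(prefix) == TVA_NS and address is not None:
                results.append(
                    (service.get("name"), local, address.get("location")))
    return results


SAMPLE_WSDL = """<?xml version="1.0" encoding="UTF-8"?>
<definitions targetNamespace="http://example.com/tva"
    xmlns:tva="http://www.tv-anytime.org/2001/11/transport/wsdl"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns="http://schemas.xmlsoap.org/wsdl/">
  <service name="TvaMovieService">
    <port name="get_Metadata_Movies" binding="tva:get_Metadata_Port">
      <soap:address location="http://example.com/movies"/>
    </port>
  </service>
</definitions>"""
```

Calling `tva_ports(SAMPLE_WSDL)` yields one entry for the movie service, with the binding name `get_Metadata_Port` and the entry point `http://example.com/movies`.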

Figure 3. The steps involved in finding new TV Anytime services.

The figure shows a client device (1) and a server (2) that offers TV Anytime
data (e.g. schedule listings, movie information, etc.). Item 5 is a structured
query from 1 to 2 (such as a SOAP request or an HTTP request) and item 6 is a
structured response from 2 to 1 (such as a SOAP or HTTP response). The legend
text for the other numbered items is illegible in the source.

To successfully find a TV Anytime web service the following sequence of requests (5) and
responses (6) must occur.

The device requests the well-known file from the server. A server that offers
TV Anytime services returns a successful HTTP response containing the
requested file (inspection.wsil). If the server offers no web services it will send
back an HTTP 404 (file not found) response. Having found a service endpoint in
the file, the device obtains the service description for that endpoint. The exact
way of doing this depends on the service description format. In this case we will
assume that WSDL is being used. To obtain the WSDL file, the device makes an
HTTP GET request to the server 2 for the file.



Figure 4. An improved WS-Inspection file (http://example.com/inspection.wsil). The numbered annotations on the figure are as follows.




1. The TV Anytime namespace, which indicates the version of the protocol being referenced.

2. The endpointPresent attribute indicates that the TV Anytime service is actually available.

3. When qualified by the namespace prefix ("tva:"), the implementedBinding elements indicate the
types of TV Anytime services available.

4. This is a link indicating the presence of a URL offering a structured file of the same format as this
one.

5. The present attribute indicates that at least one TV Anytime service is referenced in the document
that is being linked to.

Items 1-3 indicate how to use the WS-Inspection description elements to reference TV Anytime
services. The use of implementedBinding elements means that the spidering robot does not need to
download a WSDL file (as given in the location attribute) to establish the presence of a TV Anytime
service.

Items 4-5 indicate how links to other WS-Inspection documents are shown. By following these links
other WS-Inspection documents containing references to TV Anytime services will be found.
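
Items 1-4 can be illustrated with a short Python sketch that reads an improved WS-Inspection file without downloading any WSDL. Everything here is an assumption made for illustration: the function name `inspect_wsil` is hypothetical, and the sample document is a guess at the general shape of such a file built from standard WS-Inspection elements, not a reproduction of the figure. The "present" attribute of item 5 is not modelled.

```python
import io
import xml.etree.ElementTree as ET

WSIL_NS = "http://schemas.xmlsoap.org/ws/2001/10/inspection/"
WSIL_WSDL_NS = "http://schemas.xmlsoap.org/ws/2001/10/inspection/wsdl/"
TVA_NS = "http://www.tv-anytime.org/2001/11/transport/wsdl"


def inspect_wsil(wsil_text: str):
    """Return (TVA service types, links to other WS-Inspection files)."""
    prefixes, root = {}, None
    source = io.BytesIO(wsil_text.encode("utf-8"))
    # Simplification: a single document-wide map of prefix declarations.
    for event, payload in ET.iterparse(source, events=("start-ns", "start")):
        if event == "start-ns":
            prefixes[payload[0]] = payload[1]
        elif root is None:
            root = payload
    service_types = []
    for ref in root.iter(f"{{{WSIL_WSDL_NS}}}reference"):
        if ref.get("endpointPresent") != "true":  # item 2: service available
            continue
        for binding in ref.findall(f"{{{WSIL_WSDL_NS}}}implementedBinding"):
            prefix, _, local = (binding.text or "").strip().partition(":")
            if prefixes.get(prefix) == TVA_NS:  # items 1 and 3
                service_types.append(local)
    links = [link.get("location")               # item 4
             for link in root.iter(f"{{{WSIL_NS}}}link")
             if link.get("location")]
    return service_types, links


SAMPLE_WSIL = """<?xml version="1.0" encoding="UTF-8"?>
<inspection xmlns="http://schemas.xmlsoap.org/ws/2001/10/inspection/"
    xmlns:wsilwsdl="http://schemas.xmlsoap.org/ws/2001/10/inspection/wsdl/"
    xmlns:tva="http://www.tv-anytime.org/2001/11/transport/wsdl">
  <service>
    <description referencedNamespace="http://schemas.xmlsoap.org/wsdl/"
        location="http://example.com/tva_services.wsdl">
      <wsilwsdl:reference endpointPresent="true">
        <wsilwsdl:implementedBinding>tva:get_Metadata_Port</wsilwsdl:implementedBinding>
      </wsilwsdl:reference>
    </description>
  </service>
  <link referencedNamespace="http://schemas.xmlsoap.org/ws/2001/10/inspection/"
      location="http://other.example.com/inspection.wsil"/>
</inspection>"""
```

The links returned by `inspect_wsil` are the URLs a spidering robot would visit next.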